A Bayesian Approach to Discretization
Authors
Abstract
The performance of many machine learning algorithms can be substantially improved with a proper discretization scheme. In this paper we describe a theoretically rigorous approach to discretization of continuous attribute values, based on a Bayesian clustering framework. The method produces a probabilistic scoring metric for different discretizations, and it can be combined with various types of learning algorithms working on discrete data. The approach is validated by demonstrating empirically the performance improvement of the Naive Bayes classifier when Bayesian discretization is used instead of the standard equal-frequency interval discretization.
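For orientation, the baseline named in the abstract, equal-frequency interval discretization, can be sketched as follows. This is a minimal illustration only, not the paper's Bayesian method; the function names `equal_frequency_cut_points` and `discretize` are hypothetical and chosen for this example, and the rule used to pick cut points is one common convention among several.

```python
from bisect import bisect_right


def equal_frequency_cut_points(values, n_bins):
    """Return up to n_bins - 1 cut points that split `values` into bins
    holding roughly the same number of observations.

    Assumption: cut points are taken directly from the sorted sample;
    other conventions (e.g. midpoints between neighbours) also exist.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    n = len(ordered)
    # The k-th cut point is the first value belonging to bin k.
    cuts = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    # Heavily repeated values can yield identical cut points; keep each once.
    return sorted(set(cuts))


def discretize(value, cut_points):
    """Map a continuous value to the index of its interval (0-based)."""
    return bisect_right(cut_points, value)


if __name__ == "__main__":
    sample = [7, 3, 12, 1, 9, 5, 2, 11, 4, 8, 10, 6]
    cuts = equal_frequency_cut_points(sample, n_bins=3)
    print(cuts)
    print([discretize(v, cuts) for v in sample])
```

The resulting interval indices are what a learner restricted to discrete data, such as a Naive Bayes classifier, would consume in place of the raw continuous attribute.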
Similar resources
A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)
Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...
A Novel Discretization for Parameter Learning in Bayesian Network using Dynamic Programming
In AI and machine learning techniques such as decision trees and Bayesian networks, there is a growing need for converting continuous data into discrete form. Several approaches are available for discretization; however, finding an appropriate and efficient discretization method is a challenging task. In this paper, we present an impurity based dynamic multi-interval discretization approach for ...
A Bayesian Approach to
The performance of many machine learning algorithms can be substantially improved with a proper discretization scheme. In this paper we describe a theoretically rigorous approach to discretization of continuous attribute values, based on a Bayesian clustering framework. The method produces a probabilistic scoring metric for different discretizations, and it can be combined with various types of ...
A Bayesian approach for supervised discretization
In supervised machine learning, some algorithms are restricted to discrete data and thus need to discretize continuous attributes. In this paper, we present a new discretization method called MODL, based on a Bayesian approach. The MODL method relies on a model space of discretizations and on a prior distribution defined on this model space. This allows the setting up of an evaluation criterion...
An Iterative Improvement Approach for the Discretization of Numeric Attributes in Bayesian Classifiers
The Bayesian classifier is a simple approach to classification that produces results that are easy for people to interpret. In many cases, the Bayesian classifier is at least as accurate as much more sophisticated learning algorithms that produce results that are more difficult for people to interpret. Using numeric attributes with the Bayesian classifier often requires the attribute values to be ...